Extraction of user specified web knowledge using Spatial Data Mining

Authors

  • Priyanka Tiwari
  • Shri Ram
Abstract

The World Wide Web has become one of the most comprehensive information resources, and it covers, in most if not all cases, the information needs of any user. Even so, using Web information effectively and efficiently remains challenging. Web mining is the application of data mining techniques to extract knowledge from web data, including web documents, hyperlinks, and website usage logs. In this paper we extract data from the web using spatial data mining. Spatial data mining applies data mining techniques to geographic data: it performs the same functions as conventional data mining, with the end objective of finding patterns in geography. We first introduce spatial data mining and the web, then focus on how data is extracted from the web through a series of preprocessing steps, and describe a method for extracting useful information from a web page. We extract hyperlinks and email addresses from single and multiple websites. The approach is treated as spatial data mining because spatial mining extracts data from different locations, and different websites are hosted on different web servers, that is, at different locations. The extracted information constitutes the knowledge.
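The abstract does not give implementation details, so the following is only a minimal sketch of what extracting hyperlinks and email addresses from one or several websites could look like. It assumes the pages have already been downloaded as HTML strings. The names `LinkCollector`, `extract_page`, and `extract_sites` are hypothetical, the email pattern is a deliberately simple assumption, and grouping results by host name stands in for the paper's notion of "different locations".

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Simplified email pattern (an assumption; it does not cover every valid address).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in the parsed HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_page(html, base_url):
    """Return the hyperlinks and email addresses found in one page."""
    collector = LinkCollector()
    collector.feed(html)
    # Scanning the raw HTML also catches addresses inside mailto: links.
    emails = set(EMAIL_RE.findall(html))
    links = set()
    for href in collector.hrefs:
        if href.lower().startswith("mailto:"):
            continue  # already covered by the email scan above
        links.add(urljoin(base_url, href))  # resolve relative links
    return {"links": sorted(links), "emails": sorted(emails)}


def extract_sites(pages):
    """pages maps page URL -> HTML text; results are grouped by host."""
    by_host = {}
    for url, html in pages.items():
        host = urlparse(url).netloc
        page = extract_page(html, url)
        entry = by_host.setdefault(host, {"links": set(), "emails": set()})
        entry["links"].update(page["links"])
        entry["emails"].update(page["emails"])
    return {
        host: {kind: sorted(values) for kind, values in entry.items()}
        for host, entry in by_host.items()
    }
```

Calling `extract_sites` with a dictionary that maps each page URL to its HTML returns one entry per host, each holding a sorted list of absolute links and a sorted list of email addresses. Fetching the pages and any further preprocessing are left out on purpose, since the abstract does not specify them.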

Similar resources

High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences

High fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns that are highly accessed in mobile web service sequences. Unlike the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...

Use of Semantic Similarity and Web Usage Mining to Alleviate the Drawbacks of User-Based Collaborative Filtering Recommender Systems

One of the most famous methods for recommendation is user-based Collaborative Filtering (CF). This system compares the active user’s item ratings with the historical rating records of other users to find similar users, and recommends items that seem interesting to these similar users and have not been rated by the active user. As a way of computing recommendations, the ultimate goal of the user-ba...

Data Extraction using Content-Based Handles

In this paper, we present an approach and a visual tool, called HWrap (Handle Based Wrapper), for creating web wrappers to extract data records from web pages. In our approach, we mainly rely on the visible page content to identify data regions on a web page. Our extraction algorithm is inspired by the way a human user scans the page content for specific data. In particular, we use text fea...

Knowledge Mining with ELM System

The problem of knowledge extraction from the data left by web users during their interactions is a very attractive research task. The extracted knowledge can be used for different goals such as service personalization, site structure simplification, web server performance improvement or even for studying the human behavior. We constructed a system, called ELM (Event Logger Manager), able to reg...

Granularity Analysis for Spatio-Temporal Web Sensors

In recent years, much research has actively mined the exploding Web world, especially User Generated Content (UGC) such as weblogs, for knowledge about various phenomena and events in the physical world, and Web services built on this Web-mined knowledge have begun to be developed for the public. However, there are few detailed investigations of how accurately Web-mined data refl...

Publication date: 2012